🌐 Distributed LLM Systems

Load Balancing, Cluster Management, Fault Tolerance, Scaling Strategies

LENS: Learning Ensemble Confidence from Neural States for Multi-LLM Answer Integration
arxiv.org·2d
🧠Large Language Models (LLMs)
Collaborative State Machines: A Better Programming Model for the Cloud-Edge-IoT Continuum
arxiv.org·4d
⚙️AI Infrastructure Automation
Large Language Models for Supply Chain Decisions
arxiv.org·4d
🧠Large Language Models (LLMs)
Efficient Differentially Private Fine-Tuning of LLMs via Reinforcement Learning
arxiv.org·3d
✨Model optimizations in LLMs
Adaptive Cluster Collaborativeness Boosts LLMs Medical Decision Support Capacity
arxiv.org·4d
🧠Large Language Models (LLMs)
Mitigating Resolution-Drift in Federated Learning: Case of Keypoint Detection
arxiv.org·2d
🧠Large Language Models (LLMs)
Using Containers to Speed Up Development, to Run Integration Tests and to Teach About Distributed Systems
arxiv.org·4d
⚙️AI Infrastructure Automation
Tensor-based reduction of linear parameter-varying state-space models
arxiv.org·2d
✨Model optimizations in LLMs
Federated Distributionally Robust Optimization with Non-Convex Objectives: Algorithm and Analysis
arxiv.org·3d
🔧Systems-level optimizations for LLM serving
SkyEye: When Your Vision Reaches Beyond IAM Boundary Scope in AWS Cloud
arxiv.org·4d
⚙️AI Infrastructure Automation
Vulnerability Mitigation System (VMS): LLM Agent and Evaluation Framework for Autonomous Penetration Testing
arxiv.org·4d
🔧Systems-level optimizations for LLM serving
MemTool: Optimizing Short-Term Memory Management for Dynamic Tool Calling in LLM Agent Multi-Turn Conversations
arxiv.org·4d
🤖Agents using LLMs
Oranits: Mission Assignment and Task Offloading in Open RAN-based ITS using Metaheuristic and Deep Reinforcement Learning
arxiv.org·5d
⚙️AI Infrastructure Automation
Toward the Autonomous AI Doctor: Quantitative Benchmarking of an Autonomous Agentic AI Versus Board-Certified Clinicians in a Real World Setting
arxiv.org·2d
📊AI Performance Profiling
Large Language Model Powered Automated Modeling and Optimization of Active Distribution Network Dispatch Problems
arxiv.org·4d
🔧Systems-level optimizations for LLM serving
Tractable Responsibility Measures for Ontology-Mediated Query Answering
arxiv.org·2d
🧠Large Language Models (LLMs)
Nearest-Better Network for Visualizing and Analyzing Combinatorial Optimization Problems: A Unified Tool
arxiv.org·3d
📊AI Performance Profiling
LoRA-PAR: A Flexible Dual-System LoRA Partitioning Approach to Efficient LLM Fine-Tuning
arxiv.org·5d
🧠Large Language Models (LLMs)
Data-Driven Stochastic Control via Non-i.i.d. Trajectories: Foundations and Guarantees
arxiv.org·2d
⚡Real-time AI Systems
Comparing Cluster-Based Cross-Validation Strategies for Machine Learning Model Evaluation
arxiv.org·3d
📊AI Performance Profiling